
    Reinforcement Learning with Perturbed Rewards

    Recent studies have shown that reinforcement learning (RL) models are vulnerable in various noisy scenarios. For instance, the observed reward channel is often subject to noise in practice (e.g., when rewards are collected through sensors) and is therefore not credible. In addition, for applications such as robotics, a deep reinforcement learning (DRL) algorithm can be manipulated into producing arbitrary errors by feeding it corrupted rewards. In this paper, we consider noisy RL problems with perturbed rewards, where the perturbation can be modeled with a confusion matrix. We develop a robust RL framework that enables agents to learn in noisy environments where only perturbed rewards are observed. Our solution framework builds on existing RL/DRL algorithms and is the first to address the biased noisy reward setting without any assumptions on the true noise distribution (e.g., the zero-mean Gaussian noise assumed in previous work). The core ideas of our solution are estimating a reward confusion matrix and defining a set of unbiased surrogate rewards. We prove the convergence and sample complexity of our approach. Extensive experiments on different DRL platforms show that policies trained on our estimated surrogate rewards achieve higher expected returns and converge faster than existing baselines. For instance, the state-of-the-art PPO algorithm obtains average-score improvements of 84.6% and 80.8% across five Atari games, with error rates of 10% and 30%, respectively.
    Comment: AAAI 2020 (Spotlight)
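
    The core construction of unbiased surrogate rewards can be made concrete for a binary reward channel. The sketch below is a generic instance of the confusion-matrix inversion described in the abstract, written for illustration; the variable names are not taken from the paper's code.

        import numpy as np

        def surrogate_rewards(r_minus, r_plus, e_minus, e_plus):
            """Unbiased surrogate rewards for a binary perturbed-reward channel.

            e_plus  : P(observe r_minus | true reward is r_plus)
            e_minus : P(observe r_plus  | true reward is r_minus)
            Requires e_plus + e_minus < 1 so the confusion matrix is invertible.
            """
            denom = 1.0 - e_plus - e_minus
            r_hat_plus = ((1.0 - e_minus) * r_plus - e_plus * r_minus) / denom
            r_hat_minus = ((1.0 - e_plus) * r_minus - e_minus * r_plus) / denom
            return r_hat_minus, r_hat_plus

        # Sanity check: the surrogate of the *observed* reward is unbiased.
        # If the true reward is r_plus, we observe r_plus with probability
        # (1 - e_plus) and r_minus with probability e_plus.
        r_minus, r_plus, e_minus, e_plus = -1.0, 1.0, 0.3, 0.1
        s_minus, s_plus = surrogate_rewards(r_minus, r_plus, e_minus, e_plus)
        assert np.isclose((1 - e_plus) * s_plus + e_plus * s_minus, r_plus)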

    On-Device Domain Generalization

    We present a systematic study of domain generalization (DG) for tiny neural networks. This problem is critical to on-device machine learning applications but has been overlooked in the literature, where research has focused almost exclusively on large models. Tiny neural networks have far fewer parameters and lower complexity and therefore should not be trained the same way as their large counterparts for DG applications. Through extensive experiments, we find that knowledge distillation (KD), a well-known technique for model compression, is much better suited to the on-device DG problem than conventional DG methods. Another interesting observation is that the teacher-student gap on out-of-distribution data is larger than that on in-distribution data, which highlights the capacity-mismatch issue as well as a shortcoming of KD. We further propose a method called out-of-distribution knowledge distillation (OKD), whose idea is to teach the student how the teacher handles out-of-distribution data synthesized via disruptive data augmentation. Without adding any extra parameters to the model -- hence keeping the deployment cost unchanged -- OKD significantly improves DG performance for tiny neural networks in a variety of on-device DG scenarios for image and speech applications. We also contribute a scalable approach for synthesizing visual domain shifts, along with a new suite of DG datasets to complement existing testbeds.
    Comment: Preprint
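
    As a rough illustration of distilling a tiny student on synthesized out-of-distribution inputs, the sketch below performs one standard KD step on images that are assumed to have already passed through a disruptive augmentation pipeline. It is a generic KD-on-augmented-data sketch under those assumptions, not the paper's OKD implementation, and the temperature value is arbitrary.

        import torch
        import torch.nn.functional as F

        def distill_on_augmented_batch(student, teacher, aug_images, optimizer, T=4.0):
            """One distillation step on augmented (pseudo-OOD) images.

            aug_images: a batch already transformed by some disruptive
            augmentation (placeholder for the paper's synthesis step).
            The student mimics the frozen teacher's softened predictions.
            """
            teacher.eval()
            with torch.no_grad():
                t_logits = teacher(aug_images)
            s_logits = student(aug_images)
            # KL divergence between temperature-softened distributions.
            loss = F.kl_div(
                F.log_softmax(s_logits / T, dim=1),
                F.softmax(t_logits / T, dim=1),
                reduction="batchmean",
            ) * (T * T)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            return loss.item()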

    Panoptic Scene Graph Generation

    Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i.e., objects are detected using bounding boxes, followed by prediction of their pairwise relationships. We argue that this paradigm causes several problems that impede the progress of the field. For instance, bounding box-based labels in current datasets usually contain redundant classes like hairs and leave out background information that is crucial to understanding context. In this work, we introduce panoptic scene graph generation (PSG), a new task that requires the model to generate a more comprehensive scene graph representation based on panoptic segmentations rather than rigid bounding boxes. A high-quality PSG dataset, containing 49k well-annotated images that overlap between COCO and Visual Genome, is created for the community to track progress. For benchmarking, we build four two-stage baselines, modified from classic SGG methods, and two one-stage baselines called PSGTR and PSGFormer, which are based on the efficient Transformer-based detector DETR. While PSGTR uses a set of queries to directly learn triplets, PSGFormer separately models objects and relations as queries from two Transformer decoders, followed by a prompting-like relation-object matching mechanism. In the end, we share insights on open challenges and future directions.
    Comment: Accepted to ECCV'22 (Paper ID #222, Final Score 2222). Project Page: https://psgdataset.org/. OpenPSG Codebase: https://github.com/Jingkang50/OpenPSG
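
    To make the output format concrete, a panoptic scene graph can be represented as a set of segment-grounded objects plus relation triplets over them. The minimal data structure below is purely illustrative and does not follow the classes used in the OpenPSG codebase.

        from dataclasses import dataclass, field
        from typing import List, Tuple
        import numpy as np

        @dataclass
        class SegmentNode:
            """An object (thing or stuff) grounded by a panoptic segmentation mask."""
            category: str        # e.g. "person", "grass" (stuff classes included)
            mask: np.ndarray     # boolean HxW mask instead of a bounding box

        @dataclass
        class PanopticSceneGraph:
            nodes: List[SegmentNode] = field(default_factory=list)
            # (subject_index, predicate, object_index) triplets
            relations: List[Tuple[int, str, int]] = field(default_factory=list)

        # Example: "person riding horse", each grounded by a pixel mask.
        h, w = 480, 640
        person = SegmentNode("person", np.zeros((h, w), dtype=bool))
        horse = SegmentNode("horse", np.zeros((h, w), dtype=bool))
        graph = PanopticSceneGraph([person, horse], [(0, "riding", 1)])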

    A proactive controller for human-driven robots based on force/motion observer mechanisms

    This article investigates human-driven robots operating through physical interaction, which is enhanced by integrating the human partner's motion intention. A human motor-control model is employed to estimate the human partner's motion intention. A system observer is developed to estimate the human's control input in this model, so that force sensing is not required. A robot controller is developed to incorporate the estimated motion intention, which makes the robot proactively follow the human partner's movements. Simulations and experiments on a physical robot are carried out to demonstrate the properties of the proposed controller.
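
    The observer idea can be sketched for a one-degree-of-freedom robot modeled as a mass-damper driven by the commanded torque plus the unmeasured human input: the observer propagates a velocity estimate and corrects a force estimate from the velocity error, so no force/torque sensor is needed. This is a generic disturbance-observer sketch with illustrative parameters, not the article's specific observer or controller design.

        def observer_step(x_dot_meas, tau_cmd, state, m=2.0, d=5.0,
                          l1=0.05, l2=400.0, dt=0.002):
            """One step of a velocity/force observer for a 1-DoF mass-damper robot.

            Model: m * x_ddot + d * x_dot = tau_cmd + f_human, with the human
            input f_human unmeasured. Gains l1, l2 and the model parameters
            are illustrative only.
            """
            x_dot_hat, f_hat = state
            err = x_dot_meas - x_dot_hat
            x_dot_hat += dt * (tau_cmd + f_hat - d * x_dot_hat) / m + l1 * err
            f_hat += l2 * err * dt
            return x_dot_hat, f_hat

        # Simulate a constant 3 N human push on the true plant and watch the
        # force estimate converge without any force sensing.
        dt, m, d = 0.002, 2.0, 5.0
        x_dot, state = 0.0, (0.0, 0.0)
        for _ in range(3000):
            tau_cmd, f_human = 0.0, 3.0
            x_dot += dt * (tau_cmd + f_human - d * x_dot) / m   # true plant
            state = observer_step(x_dot, tau_cmd, state, m, d)
        print(round(state[1], 2))  # approaches 3.0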

    CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

    Realistic simulation is key to enabling safe and scalable development of self-driving vehicles. A core component is simulating the sensors so that the entire autonomy system can be tested in simulation. Sensor simulation involves modeling traffic participants, such as vehicles, with high-quality appearance and articulated geometry, and rendering them in real time. The self-driving industry has typically employed artists to build these assets. However, this is expensive, slow, and may not reflect reality. Instead, reconstructing assets automatically from sensor data collected in the wild would provide a better path to generating a diverse and large set with good real-world coverage. Nevertheless, current reconstruction approaches struggle on in-the-wild sensor data due to its sparsity and noise. To tackle these issues, we present CADSim, which combines part-aware object-class priors via a small set of CAD models with differentiable rendering to automatically reconstruct vehicle geometry, including articulated wheels, with high-quality appearance. Our experiments show that our method recovers more accurate shapes from sparse data than existing approaches. Importantly, it also trains and renders efficiently. We demonstrate our reconstructed vehicles in several applications, including accurate testing of autonomy perception systems.
    Comment: CoRL 2022. Project page: https://waabi.ai/cadsim
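
    In such a pipeline, the reconstruction step amounts to optimizing a low-dimensional deformation of a CAD template by gradient descent through a differentiable renderer. The PyTorch sketch below is a generic illustration under that framing, not CADSim's implementation; render_silhouette (a differentiable rasterizer) and deform_template (a deformation of the CAD prior) are hypothetical placeholders to be supplied by the caller.

        import torch

        def fit_vehicle(template_vertices, observed_masks, cameras,
                        render_silhouette, deform_template,
                        steps=500, lr=1e-2):
            """Fit a CAD template to in-the-wild observations by gradient descent.

            render_silhouette(vertices, camera) -> predicted soft mask in [0, 1]
                (hypothetical differentiable rasterizer)
            deform_template(template_vertices, params) -> deformed vertices
                (hypothetical low-dimensional deformation of the CAD prior)
            """
            params = torch.zeros(16, requires_grad=True)  # deformation coefficients
            opt = torch.optim.Adam([params], lr=lr)
            for _ in range(steps):
                verts = deform_template(template_vertices, params)
                loss = sum(
                    torch.nn.functional.binary_cross_entropy(
                        render_silhouette(verts, cam).clamp(1e-6, 1 - 1e-6), mask)
                    for cam, mask in zip(cameras, observed_masks)
                )
                opt.zero_grad()
                loss.backward()
                opt.step()
            return deform_template(template_vertices, params).detach()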

    Large Language Models are Visual Reasoning Coordinators

    Visual reasoning requires multimodal perception and commonsense cognition of the world. Recently, multiple vision-language models (VLMs) have been proposed with excellent commonsense reasoning ability in various domains. However, how to harness the collective power of these complementary VLMs is rarely explored. Existing methods such as ensembling still struggle to aggregate these models with the desired higher-order communication. In this work, we propose Cola, a novel paradigm that coordinates multiple VLMs for visual reasoning. Our key insight is that a large language model (LLM) can efficiently coordinate multiple VLMs by facilitating natural language communication that leverages their distinct and complementary capabilities. Extensive experiments demonstrate that our instruction-tuning variant, Cola-FT, achieves state-of-the-art performance on visual question answering (VQA), outside-knowledge VQA, visual entailment, and visual spatial reasoning tasks. Moreover, we show that our in-context learning variant, Cola-Zero, exhibits competitive performance in zero-shot and few-shot settings without finetuning. Through systematic ablation studies and visualizations, we validate that a coordinator LLM indeed comprehends the instruction prompts as well as the separate functionalities of the VLMs; it then coordinates them to enable impressive visual reasoning capabilities.
    Comment: Accepted at NeurIPS 2023
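
    The coordination step can be pictured as prompt composition: each VLM's candidate answer is verbalized into a single natural-language prompt, and the LLM selects or synthesizes the final answer. The sketch below is a schematic of that idea, not the released Cola code; the vlms callables and llm_complete are hypothetical stand-ins for VLM inference and an LLM completion call, and the real system also feeds captions and uses instruction tuning or in-context examples.

        from typing import Callable, Dict

        def coordinate(question: str, image_id: str,
                       vlms: Dict[str, Callable[[str, str], str]],
                       llm_complete: Callable[[str], str]) -> str:
            """Aggregate candidate answers from several VLMs via an LLM prompt.

            vlms maps a model name to a function (question, image_id) -> answer.
            llm_complete sends a prompt to an LLM and returns its completion.
            Both are hypothetical placeholders.
            """
            lines = [f"Question: {question}"]
            for name, answer_fn in vlms.items():
                lines.append(f"{name} answers: {answer_fn(question, image_id)}")
            lines.append("Considering each model's strengths and weaknesses, "
                         "give the single best final answer.")
            return llm_complete("\n".join(lines))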